Extracting Information from Conference Announcements: High Recall, High Precision
Author
Kevin Cheong, Language Technology Group, Microsoft Research Institute, School of MPCE, Macquarie University, Sydney NSW 2109, Australia
Abstract
Conference announcements are distributed widely each day via electronic mail to the research and industrial community. These conferences inform researchers, academics and the industry about the research and development (R&D) work performed in a particular field of interest. There is a wealth of information contained in this multitude of conference announcements. The aim of this research is to extract essential and relevant information from conference announcements and to explore the technologies involved. In this paper we describe an architecture for a system we have developed that extracts relevant and useful information from conference announcement electronic mail messages, with a focus on achieving a high recall and precision rate. We also discuss the extent to which the success of this information extraction task depends on domain and world knowledge.
Similar resources
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is a process of automatically providing a structured representation from an unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) which has been intensified by the increased volume of information and heterogeneity, and non-structured form of it. One of the core information extraction tasks is relation extraction wh...
Applied Text Analytics for Comments on News-Articles: A Bachelor Thesis
Several on-line daily newspapers offer readers the opportunity to directly comment on articles. In the Netherlands this feature is used quite often and the quality (grammatically and content-wise) is surprisingly high. The paper develops techniques to collect, store, enrich and analyze these comments. After giving a high-level overview of the Dutch ‘commentosphere’ we zoom in on extracting the ...
Extracting Protein-Protein Interactions from the Literature Using the Hidden Vector State Model
In the field of bioinformatics in solving biological problems, the huge amount of knowledge is often locked in textual documents such as scientific publications. Hence there is an increasing focus on extracting information from this vast amount of scientific literature. In this paper, we present an information extraction system which employs a semantic parser using the Hidden Vector State (HVS)...
Information Extraction for Call for Paper
This paper proposes a system called CFP Manager specialized on IT field and designed to ease the process of searching conference suitable to one’s need. At present, the handling of CFP faces two problems: for emails, the huge quantity of CFP received can be easily skimmed through. For websites, the reviewing of some of the main CFP aggregators available online points out the lack of usable crit...
Literature mining and database annotation of protein phosphorylation using a rule-based system
MOTIVATION: A large volume of experimental data on protein phosphorylation is buried in the fast-growing PubMed literature. While of great value, such information is limited in databases owing to the laborious process of literature-based curation. Computational literature mining holds promise to facilitate database curation. RESULTS: A rule-based system, RLIMS-P (Rule-based LIterature Mining Sy...